-
Notifications
You must be signed in to change notification settings - Fork 10
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
More general structure for loop parsing #91
Conversation
reg1 = p.group("reg1") | ||
imm = p.group("imm") | ||
state = 2 | ||
if additional_data is None: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you say what the intended logic is here? Are you expecting this to be set by the first end-regexp?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, the first end-regexp sets additional_data
. I made this choice because that's usually the subs
or cmp
in which we can see the counter register and/or decrement.
I realize that maybe I want to get additional_data
for each instruction in the "end"-part and then merge the dictionaries in the end. I'll change that.
slothy/targets/arm_v81m/arch_v81m.py
Outdated
for loop_type in Loop.__subclasses__(): | ||
try: | ||
l = loop_type(lbl) | ||
return l._extract(source, lbl) + (l,) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The ... + (l,)
may merit a comment. What's happening here? While you are at it, could you document _extract()
the return tuple given by _extract()
in a brief docstring?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Will do. (l,)
transforms l
into a tuple and merges it with the return-tuple from _extract
.
slothy/targets/arm_v81m/arch_v81m.py
Outdated
raise FatalParsingException(f"Couldn't identify loop {lbl}") | ||
|
||
class LeLoop(Loop): | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nit: Add a docstring giving an example for the type of loop recognized here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you @dop-amin, overall this looks good -- better flexibility for loop forms has been an embarrassing gap for a while.
While you're at it, could you improve the doc a bit, and hoist the abstract Loop
class somewhere where it can be shared between architecture models?
Also, would you mind adding minimal examples to examples.py
which demonstrate each loop form, so we run them in CI?
@hanno-becker I think the PR now is a good starting point for abstraction of loop handling. However, I think there may pop up new cases in the future which will require slight tweaks, e.g., passing more/different inputs to the loop subclasses. I already pass some data where I know it will be useful from our experiments with Armv7m, esp. for more complicated loop constructions that check against a pointer that is modified inside the kernel. Are there any tests you'd like me to run except CI? I fully optimized one aarch64 example already and the output code still passes the test. |
* Add file for common code between models
cf17c29
to
899328b
Compare
I think this should all be done by now. |
``` | ||
loop_lbl: | ||
{code} | ||
le <cnt>, loop_lbl |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The implementation is assuming that cnt
increases by 1
per iteration, right?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This can be broken through SW pipelining, or not, if the loop count increment is chosen as an early instruction?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No, my fault -- this is already part of LE
, I had forgotten.
"""Locate a loop with start label `lbl` in `source`. | ||
|
||
We currently only support the following loop forms: | ||
yield f"{indent}sub {other['cnt']}, {other['cnt']}, {other['imm']}" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This should use the same format that the original loop had? With/without flag, and potentially using cbnz
, bnz
, bne
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is pre-existing, so let's not block the PR because of it.
This PR adds a more generic strucutre to deal with loops: Subclasses of Loop implement a specific type of loop, for exmaple having a certain sequence of instructions at the end. The extraction works the same and is thus implemented in
Loop
, while the methods to produce the code differ.I am open to suggestions for how to improve this approach; for my usecases wrt. armv7m it worked fine like this.